SECTION I


INTRODUCTION 


Discrimination analysis has been developed through 
broad phases in much the same manner as the general 
history of statistical inference. There have been the 
Pearsonian phase with the introduction of the coef- 
ficient of racial likeness, the Fisherian phase con- 
nected with the linear discriminant function, the 
Neyman-Pearson phase with the introduction of the 
notions of risk and minimax, and the contemporary 
Waldian phase. Although the coefficient of racial
likeness and the generalized distance, proposed by Karl
Pearson and P. C. Mahalanobis, respectively, are
statistics for testing the hypothesis of homogeneity,
these statistics were the predecessors of discriminatory
techniques. It was not until the mid-1930's that
R. A. Fisher presented the first clear statement of 
the problem of discrimination and the first proposed 
solution to the problem. An excellent survey of the 
literature on discriminatory analysis and related topics 
has been compiled by J. L. Hodges in [4]. 

The general discrimination problem may be classi- 
fied into three principal types as follows: 

(1). A Finite Number of Known Distributions -
Let X be a random variable which is known to be
distributed according to one of a finite number of
distributions with known density functions, f_j(x),
j = 1, ..., m. On the basis of an observation on X,
the problem is to determine which one of the m known
distributions is the distribution of X.

(2). A Finite Number of Parametric Families of
Distributions - Let X be a random variable which is
known to have a distribution in one of a finite number
of families of distributions. The distributions in
the j-th family have density functions, f_j(x, θ_j), of
known form which depend upon the parameter θ_j, which
lies in a parameter space Ω_j, j = 1, ..., m. On the
basis of an observation on X, the problem is to
determine which one of the m families of distributions
contains the distribution of X.

(3). Nonparametric - Let T be an individual
which is known to belong to one of a finite number of
populations, π_1, ..., π_m. To each individual
there corresponds an observable value of a random
variable which could be vector-valued. On the basis
of a random sample of n_j individuals from population
π_j, j = 1, ..., m, the problem is to decide which one
of the m populations contains the individual T as a
member.

It may be that the only observation available is
the observation on the random variable, X, to be
classified, but usually there are, in addition to the
observation to be classified, other observations
available which can be used to estimate the
distributions to which X is to be assigned.

The nonparametric type of discrimination problem 
has received least attention to date. In [2], Hodges 
and Fix have considered the problem of nonparametric 
classification in the case of two populations and have 
developed procedures which were shown to have asymp- 
totic optimum properties for large samples. In [3], 
Hodges and Fix compared several of these nonparametric 
procedures against the linear discriminant function 
when the two populations are normal with equal covari- 
ance matrices. The linear discriminant function is a 
widely employed classification procedure, and therefore, 
it is of interest to determine the performance of this 
procedure when the populations are not Gaussian. In
[1], Thomas E. Eaton compared one of the nonparametric
procedures proposed in [2] against the linear
discriminant function when the two populations were
exponential. The basis of comparison in both [1] and [3]
was the probability of misclassification. This thesis
is a continuation of the research started in [1].

Section II will summarize the procedures and
results of [3] as all of the procedures used in this
paper are analogous. Section III provides a complete
comparison of the probabilities of misclassification
of a nonparametric procedure against the linear
discriminant function when the two populations are
exponential. Section III also includes a limited tabulation
of the probabilities of misclassification for the linear
discriminant function when the two populations are gamma
and one of the parameters has its domain restricted to
the positive integers. Owing to time limitations, it was
not possible to determine a satisfactory computational
formula for the probabilities of misclassification
of the nonparametric procedure when the two populations
are gamma. Section IV contains conclusions
and recommendations based on the results obtained in
Section III.

I am indebted to Professor J. R. Borsting for his
encouragement and most capable guidance
while acting as faculty advisor, and wish to thank
Professor R. R. Read for his valuable assistance and
advice as second reader. Also, I wish to thank and
acknowledge Mrs. Patricia Johnson for programming the
procedures developed in Section III of this thesis.





SECTION II
PERFORMANCE OF THE LINEAR DISCRIMINANT
FUNCTION AND A NONPARAMETRIC DISCRIMINATOR
WHEN THE TWO POPULATIONS HAVE NORMAL
DISTRIBUTIONS WITH EQUAL COVARIANCE MATRICES


Let X_1, ..., X_n and Y_1, ..., Y_n be samples
from the p-variate distributions F and G, respectively,
and let Z be an observation known to be from either F
or from G; on what basis is it decided to which
population Z belongs? When F and G are p-variate normal
distributions with equal covariance matrices, the
linear discriminant function is known to be an
appropriate procedure. But what is a reasonable procedure
when the parametric forms of F and G are not known?

In [2], Hodges and Fix suggest, as an intuitive
approach, the following nonparametric procedure:
Define in p-dimensional space a notion of distance which
will permit a ranking of the 2n observations according
to their nearness to Z. Then select an odd integer,
k, and assign Z to that distribution from which came
the majority of the k nearest observations. Several
classes of these nonparametric discriminators are
shown to have asymptotically optimum performance in
the sense that the probabilities of misclassification,

P_1 = P[Z is assigned to G | Z came from F]
P_2 = P[Z is assigned to F | Z came from G]

tend, as n tends to infinity, to the theoretical
minimum values attainable if F and G were completely known.
Since it would not be reasonable to employ a
nonparametric procedure solely on the basis of asymptotic
properties and simplicity of application, an
investigation is made in [3] to determine how much
discriminating power is lost through the use of a nonparametric
discriminator when samples are small. To this end,
Hodges and Fix assume that F and G are normal with
equal covariance matrices so that the linear
discriminant function is appropriate. Then a comparison of
the probabilities of misclassification, P_1 and P_2,
which result when the linear discriminant function is
employed with the corresponding probabilities P_1 and P_2
obtained when an alternate nonparametric discrimination
procedure is used, indicates how much discriminating
power is lost when sample sizes are small. The
remainder of this Section is devoted to summarizing some of
the procedures and results of [3].

The principal distance function compared with the
linear discriminant function is denoted by Δ,
although Δ is just one of a large class of distance
functions, any one of which could be used. This fact
is mentioned since the probabilities of error, P_1 and
P_2, depend very heavily on the distance function
employed. Also, a great part of the computations are
made using k = 1, that is, assign Z to the population
F or G from which came the individual of the pooled
samples which most closely resembles Z. This case will
be denoted the rule of the "nearest neighbor."
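The procedure just described can be sketched in a few lines of Python. This is only an illustration and not the computation of [2] or [3]: the function name `knn_assign`, the labels "F" and "G", and the use of Euclidean distance (`math.dist`) in place of Δ are assumptions made for the example.

```python
import math


def knn_assign(z, sample_f, sample_g, k=1, distance=math.dist):
    """Assign z to "F" or "G" by majority vote among its k nearest neighbors.

    z, and every element of sample_f and sample_g, is a sequence of p numbers.
    The default distance is Euclidean, which is an assumption of this sketch.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    if k > len(sample_f) + len(sample_g):
        raise ValueError("k cannot exceed the pooled sample size")

    # Rank the pooled observations by their nearness to z.
    pooled = [(distance(x, z), "F") for x in sample_f]
    pooled += [(distance(y, z), "G") for y in sample_g]
    pooled.sort(key=lambda pair: pair[0])

    # Majority vote among the k nearest; k odd rules out a tied vote.
    votes_f = sum(1 for _, label in pooled[:k] if label == "F")
    return "F" if votes_f > k - votes_f else "G"
```

With k = 1 the function reduces to the rule of the nearest neighbor; a different notion of distance can be supplied through the `distance` argument.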

By considering linear transformations on the
observation space, the problem can be reduced considerably,
since it is always possible by such transformations to
ensure that F and G will have the identity covariance matrix.
Thus, the p transformed measurements have unit variance
and are independent in each population. Also, it is
possible by such transformations to place the
expectation vector of F at the origin and the expectation
vector of G on the positive first axis. In performing
such linear transformations, the probabilities of
misclassification, P_1 and P_2, are unchanged for both
the nonparametric procedure and linear discriminant
function. Thus without loss of generality, it is
sufficient to consider the transformed populations
with the two parameters, p and λ, where

λ = E(first coordinate of Y)
  = distance between the means of the
    transformed populations.

Furthermore, from the symmetry of the problem it is
evident that P_1 = P_2 for both procedures; consequently,
it is sufficient to compute P_1, that is, assume Z is
distributed according to F.








For the univariate case, p = 1, the linear
discriminant function is greatly simplified since no
matrix computation occurs. The procedure consists
simply of computing the arithmetic mean of the sample
means,

(X̄ + Ȳ)/2,

and assigning Z to that population whose sample mean
lies on the same side of (X̄ + Ȳ)/2 as does Z itself. The
probabilities of misclassification are readily
computed by introducing two new variables which are
functions of X̄, Ȳ and Z. The exact procedure is outlined
in [3], but not included in this summary since the
subsequent investigation does not depend upon this
technique. Table 1 provides a tabulation of values
of P_1 = P_2 for various values of n and λ. All tables
in this section have been reproduced from [3].
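The univariate procedure can be checked numerically. The sketch below is one way to do so, assuming the transformed populations N(0, 1) and N(λ, 1) and taking the two auxiliary variables to be U = Ȳ - X̄ and V = Z - (X̄ + Ȳ)/2. That particular choice is an assumption, since the exact procedure is deferred to [3]. Under it U and V are independent normal variables, so that P_1 = P(V > 0)P(U > 0) + P(V < 0)P(U < 0). The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ldf_error_normal(n, lam):
    """P_1 for the univariate linear discriminant function.

    Assumes F = N(0, 1), G = N(lam, 1), n observations from each population,
    and Z drawn from F (assumptions of this sketch).
    """
    # U = Ybar - Xbar is N(lam, 2/n): probability the sample means are ordered correctly.
    p_u_pos = normal_cdf(lam * math.sqrt(n / 2.0))
    # V = Z - (Xbar + Ybar)/2 is N(-lam/2, 1 + 1/(2n)): probability Z exceeds the midpoint.
    p_v_pos = normal_cdf(-0.5 * lam / math.sqrt(1.0 + 1.0 / (2.0 * n)))
    # Error: Z and Ybar fall on the same side of the midpoint.
    return p_v_pos * p_u_pos + (1.0 - p_v_pos) * (1.0 - p_u_pos)
```

For n = 1 and λ = 1 the function returns approximately 0.4175, and for large n it approaches the value of the standard normal distribution function at -λ/2.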

For p = 1, the distance function Δ corresponds
to ordinary Euclidean distance and the nonparametric
procedure using the "rule of the nearest neighbor,"
k = 1, consists of assigning Z to that population from
which came the sample individual nearest to Z. The
probability, P_1, that the nearest neighbor to Z is one
of the Y's, given that Z is distributed as X, is readily
computed using the following technique. Define P_1(z)
to be the conditional probability that the nearest of
the 2n sample observations to Z is a Y, given that
Z = z. Hence,

(2). P_1 = E[P_1(Z)] = \int_{-\infty}^{\infty} P_1(z) f(z) \, dz

where f is the density function corresponding to F.
Since f is known exactly in (2), it remains only to
calculate P_1(z). The event, "the nearest sample value
to z is a Y" may be classified into n exclusive events,
"the nearest sample value to z is Y_i," i = 1, 2, ..., n,
where the |Y_i - z| are independent identically
distributed random variables. By defining

H_z(\delta) = P(|X - z| \le \delta)
and
K_z(\delta) = P(|Y - z| \le \delta),

it is readily shown that the density function for the
minimum of the |Y_i - z|, i = 1, ..., n is

n [1 - K_z(\delta)]^{n-1} \, dK_z(\delta)

and that P_1(z) can be computed by the formula

(3). P_1(z) = n \int_0^{\infty} [1 - H_z(\delta)]^n [1 - K_z(\delta)]^{n-1} \, dK_z(\delta)

Formulae (2) and (3) form the basis for all the
computations for the "nearest neighbor rule" for any p.
Tables 2 and 2A provide a tabulation of P_1 = P_2 for the
nonparametric discriminator, k = 1, for various values
of n and λ.
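Formulae (2) and (3) can be evaluated numerically for the univariate normal case. The sketch below assumes F = N(0, 1) and G = N(λ, 1), takes H_z(δ) and K_z(δ) to be the probabilities that |X - z| and |Y - z| do not exceed δ, and uses a nested composite Simpson's rule with truncated ranges. The function names, the truncation points and the number of subintervals are all choices made for this illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def big_phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simpson(func, a, b, m=200):
    """Composite Simpson's rule with m (even) subintervals on [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_error_normal(n, lam, m=200):
    """P_1 for the nearest neighbor rule, F = N(0,1), G = N(lam,1), via (2) and (3)."""

    def p1_given_z(z):
        def integrand(d):
            h_z = big_phi(z + d) - big_phi(z - d)              # P(|X - z| <= d)
            k_z = big_phi(z + d - lam) - big_phi(z - d - lam)  # P(|Y - z| <= d)
            dk = phi(z + d - lam) + phi(z - d - lam)           # dK_z / dd
            return (1.0 - h_z) ** n * (1.0 - k_z) ** (n - 1) * dk

        # Formula (3); beyond the upper limit dK_z is negligible.
        return n * simpson(integrand, 0.0, abs(z) + lam + 8.0, m)

    # Formula (2); the normal density is negligible outside [-8, 8].
    return simpson(lambda z: p1_given_z(z) * phi(z), -8.0, 8.0, m)
```

For n = 1 the nearest neighbor rule and the univariate linear discriminant function make the same assignment, so the n = 1 values from this sketch should agree with the first row of Table 1.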





For large n,

(4). P_1 \approx \int_{-\infty}^{\infty} \frac{f(z) \, g(z)}{f(z) + g(z)} \, dz

where g is the density function corresponding to G.
The above formula was obtained from an expansion of
formula (3) and is quite general. An application of
Schwarz's inequality to formula (4) shows that the
integral cannot exceed 1/2.
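The limiting value can be illustrated numerically. The sketch below takes formula (4) in the form of the integral of f(z)g(z)/[f(z) + g(z)], which is consistent with the exponential-case limit used in Section III, and evaluates it for F = N(0, 1) and G = N(λ, 1) by Simpson's rule. The function names, integration range and step count are assumptions of the example.

```python
import math


def simpson(func, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals on [a, b]."""
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_limit_error_normal(lam):
    """Large-sample nearest neighbor error for N(0,1) against N(lam,1)."""

    def integrand(z):
        f = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        g = math.exp(-0.5 * (z - lam) ** 2) / math.sqrt(2.0 * math.pi)
        return f * g / (f + g)

    # Both densities are negligible outside this range.
    return simpson(integrand, -10.0, 10.0 + lam)
```

Since fg/(f + g) never exceeds (f + g)/4, the returned value cannot exceed 1/2, and it equals 1/2 when λ = 0.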

Also investigated in [3] are the following
additional cases:

(i) A nonparametric procedure using Δ as a
distance function with k > 2 for the univariate and
bivariate normal distributions.

(ii) A nonparametric procedure using Δ as a
distance function with k = 1, n = 1, and p ≥ 2.

(iii) The effect of other distance functions on
the probabilities of misclassification for the
bivariate normal distribution.

Due to laborious computations, the investigation
of several of the above cases was quite limited, but
the results that were obtained indicate that the
nonparametric procedures gave "reasonable" error
probabilities in cases (i) and (ii). For the
bivariate normal distribution, however, different distance
functions produced vastly different error probabilities
in some instances.
TABLE 1

PROBABILITY OF ERROR, LINEAR DISCRIMINANT FUNCTION,
UNIVARIATE NORMAL DISTRIBUTIONS

  n      λ = 1     λ = 2     λ = 3

  1      .4175     .2532     .1235
  2      .3821     .1999     .0910
  3      .3611     .1919     .0826
  4      .3472     .1741     .0787
  5      .3376     [illegible]  .0763
 10      .3175     .1646     .0716
 20      .3110     .1616     .0692
 50      .3091     .1599     .0678
  ∞      .3085     .1587     .0668

n = size of sample taken from each population
λ = distance between the means of the two populations
Probability of error = P(Z is assigned to G | Z came from F)
                     = P(Z is assigned to F | Z came from G)


TABLE 2
PROBABILITY OF ERROR, NONPARAMETRIC DISCRIMINATOR
WITH k=1, UNIVARIATE NORMAL DISTRIBUTION

  n      λ = 1     λ = 2     λ = 3

  1      .4175     .2532     .1235
  2      .4086     .2361     .1061
  3      .4052     .2307     .1036
  4      [illegible]  .2280  [illegible]


TABLE 2-A

APPROXIMATE PROBABILITY OF ERROR, NONPARAMETRIC
DISCRIMINATOR WITH k=1, UNIVARIATE NORMAL DISTRIBUTION

  n      λ = 1     λ = 2     λ = 3

 [illegible]  .403  .226     .102
  5      .401      .225      .100
 10      .399      .223      .098
 20      .398      .224      .098
 50      .398      .225      .098
  ∞      .398      .225      .098

n = size of sample from each population
λ = distance between the means of the two populations
Probability of error
  = P(Z is assigned to F | Z came from G)
  = P(Z is assigned to G | Z came from F)








SECTION III
PERFORMANCE OF THE LINEAR DISCRIMINANT 


FUNCTION AND THE "RULE OF NEAREST NEIGHBOR" 
WHEN THE TWO POPULATIONS HAVE GAMMA DISTRIBUTIONS 


The validity of the linear discriminant function
when the population is obviously not normal has been of
great concern to many users and potential users
of this discrimination procedure. In [1], T. E.
Eaton investigated the performance of the linear
discriminant function and a nonparametric procedure for
sample sizes one and two when the univariate
distributions, F and G, are assumed to be exponential with
parameters λ and μ respectively. This investigation
was performed by computing the probabilities of
misclassification. The results of this study showed
that both the linear discriminant function and the
nonparametric discriminator using Δ as a distance
function and "the rule of nearest neighbor" can give high
probabilities of misclassification for sample sizes
one and two. In this section, the investigation
started in [1] is continued in order to provide a
limited indication of how much discriminating power
the linear discriminant function and "rule of nearest
neighbor" have when the populations are not normal.

The scope of the present study is an
investigation of the probabilities of misclassification,

P_1 = P[Z is assigned to G | Z came from F]
P_2 = P[Z is assigned to F | Z came from G]

for the two population classification problem when
the following two procedures are employed:

(i) The nonparametric procedure employing Δ as
a distance function and using the "rule of the nearest
neighbor," k = 1, when F and G are exponentially
distributed with parameters λ and μ, respectively, and
λ = cμ where c is greater than zero.

(ii) The linear discriminant function when F and
G have gamma distributions with parameters (r, λ) and
(r, μ) respectively, where r is a positive integer,
and, as above, λ = cμ where c is greater than zero.


The density functions of F and G will be denoted
by f(x;r,cμ) and g(y;r,μ) respectively where

(5). f(x; r, c\mu) = \frac{(c\mu)^r x^{r-1}}{\Gamma(r)} \exp(-c\mu x)
and
(6). g(y; r, \mu) = \frac{\mu^r y^{r-1}}{\Gamma(r)} \exp(-\mu y)

Obviously, when r = 1 in formulas (5) and (6) above,
f(x;1,cμ) and g(y;1,μ) are exponential.

A computation formula for the error probabilities,
P_1 and P_2, will be developed first for the "rule of
nearest neighbor," procedure (i) above. This procedure
consists of assigning Z to that population from which
came the sample individual nearest to Z.








Assuming equal samples, say n, are available from
each population, it is observed that the following
relation,

(7). P_1(n,c) = P_2(n,1/c)

exists between the error probabilities when F and G
have gamma distributions with density functions defined
by formulas (5) and (6); hence, this relationship exists
when F and G are exponential. Using exactly the same
technique as was outlined in Section II, it is
observed that if Z = z, and δ > 0, then

H_z(\delta) = P(|X - z| \le \delta)
            = \int_{z-\delta}^{z+\delta} f(x;r,c\mu) \, dx,   if \delta \le z
            = \int_{0}^{z+\delta} f(x;r,c\mu) \, dx,          if \delta > z

K_z(\delta) = P(|Y - z| \le \delta)
            = \int_{z-\delta}^{z+\delta} g(y;r,\mu) \, dy,    if \delta \le z
            = \int_{0}^{z+\delta} g(y;r,\mu) \, dy,           if \delta > z
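It follows from formulas (2) and (3) of Section II
that

P_1(n,c) = n \int_0^{\infty} f(z;r,c\mu) \, dz \int_0^{z} [1 - H_z(\delta)]^n [1 - K_z(\delta)]^{n-1} \, dK_z(\delta)
         + n \int_0^{\infty} f(z;r,c\mu) \, dz \int_z^{\infty} [1 - H_z(\delta)]^n [1 - K_z(\delta)]^{n-1} \, dK_z(\delta).

Hence, by the simple change of variables, δ' = cδ, z' =
cz, y' = cy and x' = cx, it follows that

P_1(n,c) = n \int_0^{\infty} f(z;r,\mu) \, dz \int_0^{\infty} [1 - H_z(\delta)]^n [1 - K_z(\delta)]^{n-1} \, dK_z(\delta)
         = P_2(n,1/c),

where, after the transformation, H_z and K_z are computed
from the densities f(x;r,μ) and g(y;r,μ/c), respectively.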


Unfortunately, it was only possible to determine a
suitable computational formula for P_1(n,c) when F and G
are assumed to have exponential distributions. A
preliminary survey indicated that a large computational
program would be required if F and G are assumed to
have the gamma distributions defined at the beginning
of this section.

When F and G are assumed to be exponential, a
suitable computation formula for P_1(n,c) is obtained
as follows: First, let z' = μz, δ' = μδ, x' = μx and y' = μy,
and combine terms to obtain

P_1(n,c) = \frac{c}{(c+1)(2nc+2n+c)}
         + 2nc \int_0^{\infty} \exp(-cz-z) \, dz \int_0^{z} [1 - 2\exp(-cz) \sinh c\delta]^n [1 - 2\exp(-z) \sinh \delta]^{n-1} \cosh \delta \, d\delta
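The double integral above can be evaluated directly by numerical integration. The sketch below is an illustration only and is not the binomial-series formula used for Table 3. It assumes μ = 1, so that F is exponential with rate c and G is exponential with rate 1, takes the leading term to be c/[(c+1)(2nc+2n+c)] (the value obtained by integrating the δ > z branch in closed form), and writes 2 exp(-cz) sinh cδ as exp(-c(z-δ)) - exp(-c(z+δ)) for numerical stability. The function names, the truncation point 30/c and the number of subintervals are assumptions.

```python
import math


def simpson(func, a, b, m=200):
    """Composite Simpson's rule with m (even) subintervals on [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_error_exponential(n, c, m=200):
    """P_1(n, c) for the nearest neighbor rule with exponential populations.

    Assumes rates c (population F) and 1 (population G), n observations from
    each population, and Z drawn from F.
    """

    def inner(z):
        def integrand(d):
            h_z = math.exp(-c * (z - d)) - math.exp(-c * (z + d))  # 2 exp(-cz) sinh(c d)
            k_z = math.exp(-(z - d)) - math.exp(-(z + d))          # 2 exp(-z) sinh(d)
            dk = math.exp(-(z - d)) + math.exp(-(z + d))           # 2 exp(-z) cosh(d)
            return (1.0 - h_z) ** n * (1.0 - k_z) ** (n - 1) * dk

        return simpson(integrand, 0.0, z, m)

    # Contribution of nearest-neighbor distances not exceeding z.
    near = simpson(lambda z: n * c * math.exp(-c * z) * inner(z), 0.0, 30.0 / c, m)
    # Contribution of nearest-neighbor distances exceeding z (closed form).
    far = c / ((c + 1.0) * (2.0 * n * c + 2.0 * n + c))
    return far + near
```

For n = 1 and c = 2 the sketch returns approximately 0.4000. The fixed grid becomes coarse when c is much smaller than 1, in which case the relation P_1(n,c) = P_2(n,1/c) or a larger value of m would be needed.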


Then by interchanging the order of integration and
expanding both [1 - 2exp(-cz) sinh cδ]^n and
[1 - 2exp(-z) sinh δ]^{n-1} into binomial series, it can
be shown that

P_1(n,c) = \frac{c}{(c+1)(2nc+2n+c)}
n n-1 
US n п-1 ih 
k ] (cktj+c+1) 
к= )=0 
k BE 
k i j p 1 1 
ba (13 H )-1( Бә کي‎ | መመን +2 
| | K , Jp Кеш و 3 و‎ 0 
1=0 p=0 
where Е . . - "(2скт21-2сі-2ріс) 
орак 2 1 


Since P_1(n,c) = P_2(n,1/c), Table 3 provides P_1(n,c) for
c = 1, 2, 3, 4, 5, 10, 20 and the reciprocals for a wide range
of values of n. Then by utilizing formula (4) of
Section II, it is possible to obtain a reasonable upper
bound for P_1(n,c) as n tends to infinity. To begin
with, it is observed that P_1*(c), where P_1*(c) is
defined as

P_1^*(c) = \lim_{n \to \infty} P_1(n,c) = \int_0^{\infty} \frac{c \exp(-cz)}{c \exp(-cz+z) + 1} \, dz,

has by Schwarz's inequality an upper bound of 1/2. A
better upper bound can be obtained for c > 5 and c < 1/5
by noting that

\frac{c \exp(-cx)}{c \exp(-xc+x) + 1} < \frac{c \exp(-xc+x)}{c \exp(-xc+x) + 1}

for 0 < x < ∞ and c > 0; hence, for c > 1, integration
yields

P_1^*(c) \le \int_0^{\infty} \frac{c \exp(-xc+x)}{c \exp(-xc+x) + 1} \, dx = \frac{\ln(c+1)}{c-1};

therefore, the same bound holds with c replaced by 1/c
when c < 1, since it is evident from formula (4) that
P_1*(c) = P_1*(1/c) = P_2*(c) = P_2*(1/c). Table 3 contains
limiting probabilities, P*, which were computed by
numerical integration using Simpson's rule.
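A computation of this kind can be sketched as follows. The code evaluates the limiting integral by composite Simpson's rule, with the integrand written in the equivalent form f(z)g(z)/[f(z) + g(z)] for f(z) = c exp(-cz) and g(z) = exp(-z), and compares it with the bound ln(c+1)/(c-1) for c > 1. The function names, the truncation of the upper limit and the number of subintervals are assumptions of this illustration and are not taken from the original program.

```python
import math


def simpson(func, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals on [a, b]."""
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_limit_error_exponential(c):
    """Limiting nearest neighbor error P*(c) for exponential populations."""

    def integrand(z):
        f = c * math.exp(-c * z)  # density of F, rate c (mu taken as 1)
        g = math.exp(-z)          # density of G, rate 1
        return f * g / (f + g)    # equals c exp(-cz) / (c exp(-cz+z) + 1)

    # The integrand is below min(f, g), so it is negligible past this point.
    return simpson(integrand, 0.0, 40.0 / max(c, 1.0))


def log_bound(c):
    """Upper bound ln(c+1)/(c-1), valid for c > 1."""
    return math.log(c + 1.0) / (c - 1.0)
```

For c = 5 the integral is approximately 0.3188 and the bound is approximately 0.448. The bound falls below 1/2 only when c exceeds roughly 4.5.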

The result that the "rule of nearest neighbor" will
have, as n tends to infinity, limiting probabilities of
error of at most 1/2 is particularly interesting since, as
will be shown, no such general statement can be made for
the linear discriminant function when the populations are
characterized by exponential distributions. Considering
now the linear discriminant function for the case when
the populations, F and G, are assumed to have gamma
distributions, a computational formula will be developed
for the probabilities of misclassification. Again, it
will be assumed that the samples available from each
population are equal. Since this procedure consists of
computing the arithmetic mean, (X̄+Ȳ)/2, of the sample
means and assigning Z to that population whose sample
mean lies on the same side of (X̄+Ȳ)/2 as does Z itself, the
error whose probability is P_1 is committed if and only if

Z > (X̄+Ȳ)/2 and Ȳ > X̄
or
Z < (X̄+Ȳ)/2 and Ȳ < X̄.

Thus, by the definition of P_1 it follows that

P_1 = P[Z > (X̄+Ȳ)/2, Ȳ > X̄] + P[Z < (X̄+Ȳ)/2, Ȳ < X̄].
For the purpose of convenience, it is desirable to
define two new random variables, S and T, where S = nX̄
and T = nȲ. Let the density functions of S and T be
denoted by f(s;nr,cμ) and g(t;nr,μ), respectively.
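The event defining P_1 can also be simulated directly, which gives a rough check on any tabulated value. The sketch below is a Monte Carlo illustration and not the method used for Table 4: it draws S and T as gamma variables with shape nr, draws Z from F, and counts how often Z and the sample mean of G fall on the same side of (S+T)/2n. The function name, the default μ = 1, the number of trials and the seed are assumptions.

```python
import random


def ldf_error_gamma_mc(n, c, r=1, mu=1.0, trials=100_000, seed=0):
    """Monte Carlo estimate of P_1(n, c) for the linear discriminant function.

    F is gamma with shape r and rate c*mu, G is gamma with shape r and rate mu.
    random.gammavariate takes a shape and a scale, so the scale is 1/rate.
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        s = rng.gammavariate(n * r, 1.0 / (c * mu))  # S = n * Xbar
        t = rng.gammavariate(n * r, 1.0 / mu)        # T = n * Ybar
        z = rng.gammavariate(r, 1.0 / (c * mu))      # Z drawn from F
        midpoint = (s + t) / (2.0 * n)
        # Z is assigned to G when it lies on the same side of the midpoint as Ybar.
        if (z > midpoint and t > s) or (z < midpoint and t < s):
            errors += 1
    return errors / trials
```

With 100,000 trials the standard error of the estimate is roughly 0.0015, so agreement with a tabulated value can be expected only to about two decimal places.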

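The probability, P_1, can now be expressed more
conveniently as

P_1(n,c) = P[Z > (S+T)/2n, T > S] + P[Z < (S+T)/2n, T < S]

  = \int_0^{\infty} f(s;nr,c\mu) \, ds \int_s^{\infty} g(t;nr,\mu) \, dt \int_{(s+t)/2n}^{\infty} f(z;r,c\mu) \, dz

  + \int_0^{\infty} f(s;nr,c\mu) \, ds \int_0^{s} g(t;nr,\mu) \, dt \int_0^{(s+t)/2n} f(z;r,c\mu) \, dz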


As in the "rule of nearest neighbor" procedure, it can
easily be shown by the following change of variables,
z' = cz, t' = ct, and s' = cs, that the relationship
between P_1 and P_2 is again given by P_1(n,c) = P_2(n,1/c).
Since P_1(n,c) = P_2(n,1/c), it is sufficient to
obtain a computation formula for P_1(n,c). The methods
employed to obtain this formula are now outlined. First,
it is observed that P_1(n,c) can be expressed as

P_1(n,c) = \int_0^{\infty} f(s;nr,c\mu) \, ds \int_0^{s} g(t;nr,\mu) \, dt

  + \int_0^{\infty} f(s;nr,c\mu) \, ds \int_s^{\infty} g(t;nr,\mu) \, dt \int_{(s+t)/2n}^{\infty} f(z;r,c\mu) \, dz

  - \int_0^{\infty} f(s;nr,c\mu) \, ds \int_0^{s} g(t;nr,\mu) \, dt \int_{(s+t)/2n}^{\infty} f(z;r,c\mu) \, dz
Now by utilizing the well known integration by parts
formula,

\frac{1}{\Gamma(n)} \int x^{n-1} \exp(-ax) \, dx = -\exp(-ax) \sum_{k=0}^{n-1} \frac{x^k}{k! \, a^{n-k}},

it can be shown that


nr-1 


nr (т к=? 


E 1) ( c+ 1) 
k=0 k 


MI EE 
1 


ао с=с nr+k 


(пг+1-1)! 


26 E (n ۱ 
[(nr-1)4° Ga TAG 
k 


=0 1=0 


+ 


nr+1-1 
Спе Г! 
mien 0 Penn a 
1 r-1 
"`. 1 


|.) nyy *(מ2)‎ 617% 


k k=0 
(nr+i-1)!(nr+k-i-1)! 
MA-ES г 


نا 
|| 
e‏ 
Table 4 provides a tabulation of the probabilities of
misclassification, P_1(n,c) = P_2(n,1/c), for several
integer values of r; c = 1, 2, 3, 4, 5, 10, 20 and the
reciprocals; and a fairly wide range of values for n.

The probabilities of misclassification for the
linear discriminant function were also examined when
unequal samples were available from the populations
F and G for the special case when r = 1. Using
techniques analogous to those described in the preceding
paragraph, it is observed that for samples of size n
and m from the populations described by the
distributions F and G respectively, the relationship
between P_1 and P_2 is

P_1(j=n,i=m,c) = P_2(j=m,i=n,1/c)
where 


1 
[1+1/(2j)1) [1+c/(2i)11 


P,( j=n,i=m,c) = 1 - 


1 j*k-1 1 


— [15i/Cjc2]J ጫጨ. ከ... 
k=0 Ti 
рс e 3+%-1| )156/2(* 
[1+ር/(2331” [c/j+ce+i/j13 E. POS E 
k=0 


Although a tabulation of the error probabilities, P_1
and P_2, when the sample sizes are not equal, would be of
some value and interest, time limitations precluded
the computation of a table which would enumerate these
probabilities.

In the special case of r = 1, it was possible to
determine the limiting probabilities of
misclassification. The procedure for obtaining the limiting
probabilities is briefly outlined. When r = 1, the
distributions F and G are exponential, and P_1 can be expressed
as

P_1(n,c) = \int_0^{\infty} f(s;n,c\mu) \, ds \int_0^{s} g(t;n,\mu) \, dt

  + \int_0^{\infty} f(s;n,c\mu) \exp[-c\mu s/(2n)] \, ds \int_s^{\infty} g(t;n,\mu) \exp[-c\mu t/(2n)] \, dt

  - \int_0^{\infty} f(s;n,c\mu) \exp[-c\mu s/(2n)] \, ds \int_0^{s} g(t;n,\mu) \exp[-c\mu t/(2n)] \, dt

which by the change of variables, s' = cμ(2n+1)s/(2n)
and t' = μ(2n+c)t/(2n) for the integrals containing the
exponential factors, and t' = μt and
s' = cμs for the remaining integral, yields

P_1(n,c) = \frac{1}{q(n,c)} + \int_0^{\infty} f(s;n,1) \, ds \int_0^{s/c} g(t;n,1) \, dt

  - \frac{2}{q(n,c)} \int_0^{\infty} f(s;n,1) \, ds \int_0^{h(s)} g(t;n,1) \, dt

where
h(s) = (2n+c)s/(2nc+c)
and
q(n,c) = [1+1/(2n)]^n [1+c/(2n)]^n.
Now, if the simple one-one transformation,

x = s/(s+t)
y = s + t

is utilized, the above expression for P_1(n,c) becomes

P_1(n,c) = \frac{1}{q(n,c)}
  + \frac{1}{\Gamma^2(n)} \int_0^{\infty} y^{2n-1} \exp(-y) \, dy \int_{c/(c+1)}^{1} x^{n-1} (1-x)^{n-1} \, dx
  - \frac{2}{q(n,c) \Gamma^2(n)} \int_0^{\infty} y^{2n-1} \exp(-y) \, dy \int_{w(n,c)}^{1} x^{n-1} (1-x)^{n-1} \, dx

which upon integrating out y, can be expressed as

(8). P_1(n,c) = \frac{1}{q(n,c)}
  + \frac{1}{B(n,n)} \int_{c/(c+1)}^{1} x^{n-1} (1-x)^{n-1} \, dx
  - \frac{2}{q(n,c) B(n,n)} \int_{w(n,c)}^{1} x^{n-1} (1-x)^{n-1} \, dx

where
w(n,c) = (2nc+c)/(2nc+2n+2c)
and
B(n,n) = Γ²(n)/Γ(2n).
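Formula (8) lends itself to direct computation. The sketch below reads it as P_1(n,c) = 1/q + U(c/(c+1)) - (2/q) U(w), where U(a) is the integral of x^(n-1)(1-x)^(n-1) from a to 1 divided by B(n,n), with q(n,c) = [1+1/(2n)]^n [1+c/(2n)]^n and lower limit w = (2nc+c)/(2nc+2n+2c). For integer n the incomplete beta integral is evaluated through the standard binomial-sum identity. The function names and this way of evaluating the integral are choices made for the illustration.

```python
import math


def beta_upper_tail(a, n):
    """Integral of x^(n-1) (1-x)^(n-1) over [a, 1], divided by B(n, n).

    Uses the identity I_a(n, n) = P(Binomial(2n-1, a) >= n) for integer n.
    """
    m = 2 * n - 1
    lower = sum(math.comb(m, j) * a ** j * (1.0 - a) ** (m - j)
                for j in range(n, m + 1))
    return 1.0 - lower


def ldf_error_exponential(n, c):
    """P_1(n, c) for the linear discriminant function with r = 1, read from (8)."""
    q = (1.0 + 1.0 / (2.0 * n)) ** n * (1.0 + c / (2.0 * n)) ** n
    w = (2.0 * n * c + c) / (2.0 * n * c + 2.0 * n + 2.0 * c)
    return (1.0 / q
            + beta_upper_tail(c / (c + 1.0), n)
            - (2.0 / q) * beta_upper_tail(w, n))
```

For n = 1 and c = 2 the sketch gives 0.4000, and for c = 1 it gives 1/2 for every n. For large n and c > 1 the values approach exp[-(c+1)/2].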
Since it is evident when c = 1 that P_1(n,c) = 1/2 for
all n, it remains only to consider the cases, 0 < c < 1
and c > 1. By considering each case separately and
applying Chebyshev's inequality to formula (8), the
limiting probability of P_1(n,c) is

(9). P_1^*(c) = \lim_{n \to \infty} P_1(n,c) =
       1 - \exp[-(c+1)/2],   if 0 < c < 1
       1/2,                  if c = 1
       \exp[-(c+1)/2],       if c > 1

As mentioned previously, the limiting probabilities for
the nonparametric discriminator, "rule of nearest
neighbor," are at most 1/2, but from formula (9) it is
apparent that the limiting probabilities for the linear
discriminant function are greater than 1/2 for

2 ln 2 - 1 < c < 1.


E 


OD -ч © (n £ Ww N =~ 


መ“ =‏ رم 
о щл о ж‏ 


8 


„5000 
. 5000 
492000 
„5000 
„5000 


5000 


. 5000 
„5000 
. 5000 
«5000 


‚ «2000 


«5000 
93000 


TABLE 3
EXPONENTIAL POPULATIONS

3.0 4.0
.5262 8 ۰. 1 
۰3560 7 
.3691 5 
۰27۵۱ -3297 
«3808 . 347 ` 
2383238 . 53807 
3853 . 3404 
03868 ., 3521 
.3879 ۰ 5 
.38888 5 
.3913 5 
.3925 . 3489 
.3954 2 اا‎ 


29 


.2359 


۰2693 . 


„2850 


„2938. 


„2992 
93029 
. 3055 
e 3074 


«3089. 
53100. 


23134 
23149 
. 3188 


10.0 


. 1676 
.1831 
„1924 
„ 1985 
„2027 


` „2057 
2.2080 


„2098 
„2112 
„2152 
„2171 

7 


20.0 ` 


40757 . 


„0957 
. 1077 
.1155 
. 1209 


. 1258 ` ` 
11 1278 


„1301 


.1319 ` ` 
„1333 


. 1377 
. 1398 
+ 1447 : 





E 


Р" 
=> 2 со іс (n ₪ ₪ =~ 


N 
о 


8 


. 5000 


. 5333 
„5003 
„4856 
.4773 
.4719 
.4682 


.h655 


.4634 
shól8 
„3605 


«4507 


„3333 


25215 
~4666 
25426 
2.4294 
„4212 


2.4157: 


„+118 


.Н089 
4067 


95050 
95965 


. 395%. 


0 
„5037 
با„‎ 0 
„ Oh 3 
. 3884 
„3788 
. 3725 
.3683 
„3652 
„3629 
„3612 
. 3549 
. 3524 


26 


„4329 
„32717 
„2858 
22657 
„2527 
22451 
0 
. 2364 
„2338 
„2319 
27 
„2217 


„0500 


. 3907 


2714 . 


. 2239 


‚ +199? 
. 1854 


‚ 1761 


1698 . 


• 1652 


1617 ° 


.1592 
. 1492 
o 1447 


6\ א 


O NY O U QU N — 


"15 
20 
25 
30 
55 
40. 
50 
50 
70 
80 
90 

100 


СО. 


.5000 
25000 
.5000 
"5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
15000 
.5000 
.5000 
„5000 
„5000 
„5000 
„5000 
„5000 
0 
„5000 
„5000 
„5000 


TABLE 4 


М 
- 


LINEAR DISCRIMINANT FUNCTION
LY BE AEAT IBAs 
ve | 

2.0 3.0 4.0 5.0
4000 .3262 ۰27۷1 .2359 
2.3637 .2652 .2006 8 7 
iso] 0292 Чо В. 6 
.3207 .2058 .1393 ,ዐ0985 
.3062 .1898  .1252 .0861 
.2945 ۰.178۷ .1161 0. 
۰2818 .1702 .1099 ؛‎ 1 
۰2768 .1642 „1056 6۵ 
4017501: aiso? Sio: 73990581 
.2643 .1562  .1000 ۰ 1 
.2460 38.1473 ۵ ی‎ . 0606 
.2369 .1u39 .0907 9 
22322 ۰.1342 №0890 №. 0563 
.2296 —.1109 .0879 2 
Боза от Oo C OS 
.2271 ۰.1395 .0864 8 
.2260 .1387 .0856 0 
22255 .1381 .0850 5 
2251 ео ЩЩ 
22۷9 1зт OO 002i. 
mour N DOGS о 
.2245 .1370 .0838 | %ו05.‎ 
.2231 .1353 .0821 .0498 


10.0 


„ 1385 
„0627 
. 0365 
. ህረ 25 
. 0197 
. 0164 


„0127 


SONAS 
„0105 
. 0082 
„ 0072 


۰ 006, 


- 0060 
. 0057 
. 0055 
. C052 
. 0050 
| 0019 
. 0048 


‚0047 


,646 
. 0011 


2C.C 


. 0757 
. 0209 


. 0083 . 


„0043 
„0026 
„0017 
. 0012 
„0009 
„0007 
„0006 
„0003 
„0002 
„0001 
„0001 
„0001 
„0001 
„0001 
0001. 
۱1 
„0000 
„0000 
:0000 
„ 0000 





ج 
< 


D NM O NEUN = 


15 
20 
25 
۲ 0 
35 
40 
50 
60 
70 
80 
“90 
100 


.0 


2 2212 
„52965 
29278 
‚5265 
„5256 
“2251 
05247 
25245 
20244 


25244 
5217 
. 2250 
. 5254 
. 5256 


5256 


„5262 
05264 
05266 
„5267 
. 5268 


0526S. 


. 5277 


22522 


05214 
]2504 
آ باو باء 
4893. 
4860. 
25841 
04829 
04822 
04819 


۰۷ 8 


4823„ 
24831 
24838 
24842 
24846 
8 با 8 با , 
4852„ 
24854 
24856 
25857 


44858. 


. 4859 
04833 


TABLE 4 


230 21 
.2 


'shééé 


13 ۷6 , 
8. 
8. 
6. 
44577 
4580« 
با58 با. 
۱ با . 
44612 
44619 
4624« 
44627 
5630. 
046223 
66. 
shé3?‏ 
04639 
446040 


PETI 


«4647 
LRO
TICNS 

e 2CCO e 10‏ 
43265„ ۰۷0 
est? ۰ ۷ 7 6‏ 
1053 . 4469 . 
3 ۰۷ ۷59 ۰ 
e4C9E‏ 15 44. 
4115. 4415 . 
4131 . 5 . 
25142 1 3 با با , 
4152 . 8 بابا. 
606 ابا . ہا پا ہا با م 
4182. 5. 
e4195‏ 4477 
4202 . با ع با با ے 
6 ۰۷20 8 8 با با , 
216با. 2 وه 
4212„ با وبا با , 
6 1 2 ۰ 8 بابا.ء 
el SCC «421€‏ 
4220 . 2 5 ۰ 
e422)‏ 3 5 ۰ 
e45C4 „4222‏ 
24223 4505 . 
04231 512 4. 


e CSCO 


.39C7 
„3813 
„3866 
„3912 
, باب و5‎ 
„3966 
„3982 
. 3995 
. 4 OCH 
„012 
. 4036 


224058 


° 4055 
e 4 CóO 
. 4063 
. ЦС 06 
. 4 СТО 
۰۷2 
e4 C74 
. 4075 
. ЦС 76 


4077 . 
با ۵8 با . 


= 
< 


O Y Con FF vw N — 


D 0 - O ጅ U ዚህ N N — — 
о о о O co O (л © (л ር2 ኢኪ © D 


100 


«5000 


«5000 
„5000 
. 5000 
„5000 
„5000 
. 5 000 


42000. 


„5000 
„5000 
„5000 
„5000 
„5000 


„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
.5000 
„5000 


.3598 
Ex 
.2836 
.2639 
.2498 
„2396 
„2319 
„2261 
Pon" 
62150 
.2092 
.2058 


.20%2 


.2033 


`... 
E2022 


EOS 


КОО? 


„2009 
„2007 
„2005 


2004 


SCRIMINANT 
nhất BỘ GER 
በ፲ 2 

3.0 4.0 
2253 1851 
.1839 4.1133 
.1508 .0 
. 1329 .0717 
1226800902 
.1162 „0600 
סקוו.‎ 0571 
.1090 9 
.1069 ۵ ۵۳ 3 
51052 "70520 
.1006 0H81 
.0984 ۰. 2 
-09 70. 720350 
.0961 2.0443 
„0955 20537 
0950 0155 
.0943  .0h27 
.0939 =. 0423 
.0935 OU 
.0933 „0419 
.0931 ۵ ۲ 
20929 ۰۵ 


29 


— 


. 1404 
«07343 
„0505 
. 0406 
„0353 
„0320 
„0298 
. 0281 
„0269 
„0259 
“0229 
„0215 
20207 
„0201 
„0197 
20195 
„0190 
. 0187 
„0185 
.0184 
.0183 
.0182 


- 10.0 


. 0512 
.0138 
. 0063 
- 0038 
. 0026 
۰ 0020 
„0016 
„00135 
„0011 
. 0010 
. 0005 


„0005 


. 0004 
„ 000 
. 0003 
. 0003 
.0003 
„0003 
„0003 
„ 0003 
. 0003 
. 0002 


208 


„0158 
20017 
„000% 
„0001 
„0001 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 
„0000 





100 


. 2000 


.9 
.4 782 
©4658 
+4580 
„527 
.4491 
«4466 
eG LB 
„+435 


۰ ۱۷2 6 ' 


25409 
25407 
2.4809 
25410 
25512 
25413 
5 با با , 
24416 
25417 
E‏ 
25518 
24418 


e487 


. 77 


. 396 
. 3877 
. 386 
. 3992 
«5827 
. 3826 
.3826 
- 552 
3039 
23831 
235840 
23841 
. 3842 


.3883 


. 3845 
„3815 
«35856 
25846 


43847 
.3847 


. 407] 
„3678 
„3566 
„5553 
„5526 
«35526 


2.3528 


„3530 
„3533 
„3535 
„3541 
23544 
23546 
„3547 
„3548 
23549 
23550 
.3550 
.3551 
.3551 
„3552 
„3552 
FUNCTION
TIONS 

.2000 ዓ1000 

1177 TOS 

. 3۷253 ۰ ۰.6 

.3354 .2979 

. 3315 ۰ 

.33%% .2991 

.3388 „2994 

.3351 .2997 

„335% ,2999 | 
.3356 0 

' 3001. 3358. 
۰300 363ة3. 

.3366 5 

.3367 4.6 

25505 3008 

.3369 .3007 

. 3370 8 

.3371 ۵ 8 

.3371 8 

.3371 .3008 

2.3372. .3009 

.3372 ` .3009 

.3372 9 


. 0500 


‚2807 
22794 
. 2805 
. 2812 
.2815 
„2817 
„2819 


,2820 


.2821 
.2821 
.2823 
. 2824 
. 2824 
#2825 
22093 
. 2825 
„28625 
„2825 
. 2826 


' e 2026 


.2826 
. 2826 


< 


о Om uU F&F Q ND > 


0” DAN OW F WW Ñ N = — 
O O O O O O G O Qn O іл о о 


100 


. 3000 


.5000 
„5000 
„5000 
„5000 


.5000 


„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
. 3000 
„5000 
„5000 
„5000 


LINEAR DIS 
م‎ 

R 

2720 3.0 
.3266 .1998 
.2719 .1317 
۰.2۷12 ۰6 
22223 8 
۰2100 ۰ ۰089 
۰2018 .0807 
.1961. ۹ 
۰1920 .0758 
„1891 ۵ 3 
1869 0 
۰1815 — .069u 
.1794 ۵ ۰5 
۰1782 .0664 
.1774 7 
.1769. ۱ 
.1765 ۰8 
.1759 ۰ ۰2 
.1755 ۵ ۰8 
01752 6 
۰1750 ۰ 
417,49 2 
.17u7 7.0631 


e 1278 
. 0675 
. 0485 
. 0403 
. 0359 
. 0331 
. 0312 
• 0298 
. 0287 
• 0278 
. 0252 


. 0240 


0232 . 
0227 • 
0224 . 
ו022. 
0217 • 
0215 • 
0213 • 
0212« 
90211 


31 


. 0859 


„0368 
«0237 
20184 


240156 


. 0138 
. 0126 
20117 
„0110 
„0105 
„0090 
„0083 
„0078 
„0076 
. 
„0072 


„0070 - 


„0069 
„0068 
„0067 
„0067 
„0066 


10.0 ' 


e 0034 
„0012 


. 0006. 


e 0004 
. 0003 
e 0002 
• 0001 
50001 
• 0001 
• 0001 
. 0009 
. 0000 


. 0000 ” 


• 0000 
• 0000 
• 0000 


„0002 
_ «0000 


e 0000 
• 0000 
• 0000 


20.0 


.0035 
.0002 
.0000 
„0000 
۰0000 


0000 
7 40000. 


• 0000 
• 0000 


`.0000 


„0000 


„0000. 


„0000 


„0000 ` 


„0000 
„0000 
„0000 
„0000 
„0000 
„0000 


` e0000 


„0000 





E 


D YY O ጊስ + Ww N ~ 


D 0 3 O (л £ WwW ርህ ኮን NN = په‎ 
O O O O O O {їл © ₪ ₪ о 


100 


92000 


.466С 
.6 
.4 158 
‚1056 
93597 
„3960 
„3937 


„3923 
3913 


.3908 
„3899 
• 3500 
„3901 


„3902. 


„3903 
„3903 
-3904 
. 3904 
.3905 
.3505 


_.3905 


„3905 


N---- 


. 3876 
23443 


„3307 


„3260 


. 3242. 


„3236 
23234 
23233 
„3233 
„3233 
„3233 


.3233 


„3233 


235233 
| .3233 
23235 


93232 
.3233 
„3233 


3233 
.3233 
53233 


“3361 
° 3004 
‚2931 
„2913 
. 2908 
2905 
.2904 
.2903 
.29C2 
.2902 
.29С0 


TABLE 4 


2899. 


„28598 


22898 


„2897 


„2857 


„28917 
۰27 


` „2896 


‚2896. 


„2896 


„2896 


22 


FUNCTION 
LES 
ICNS 
.2000 .1000 
e304] — .2uT1 
. 2770 14 
„2128 ۰ ۰3 
.2718 1 
.2713 ۵ ۰ 3 
. 2710 2328 
„2708 .2324 
. 270 ۰۰۱ 
. 2705 8 
. 2701 ۰ ۰6 
ו276.‎ 2316 
2.2659  .2307 
۰.2698 2305 
۰.2698 53 
» 2697 ל‎ 
‚2697 72 
„26856 `` „2301 
۰2696 46 
„26955 ۰ ۰.2299 
. 2695 „2299 
¿2695 .2299 
22298 


. 2695 


.05C0 


‚2260 
„2159 
„2173 
„2158 
„218 


„2136 


„2132 


„2129 
„2126 
„2119 
„2115 
„2112 
°2111 
„2109 
„21С8 
„2107 
„2106 
۰2 6 
«2105 
„2105 
„2105 
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O N oa un Р UNUN = 


ውፍ WN Ғғ Ww Ww N N ~ ہے‎ 
O O O Q O (л O Q O о 


1.0 


«5000 


.5000 


. 5000 
«5000 
. 2000 
. 5000 
92000 
42000 
42000 


‚+5000 
92000 


„5000 
„5000 
„5000 
„5000 
„5000 
„5000 
„5000 


TABLE Y 


LINEAR DISC 
ER 


FOR GAMM 


2.0 


e2974 


.2379 


. 2076 


2.1904 


1801 
„1736 


_„1693 
.1663. 


„1612 
„1526 


21586 
21567. 
‚ „1556. 


. 1549 


„1500 
221540 
21534 


„1531 


„1588 
.0964 
. 0750 
. 0656 
«0606 


‚ 0574 


.0552 
.0536 
.0524 
20514 
20484 
20469 
„0460 


. eQh5h 


. 0449 
«0446 
20442 
.0ц39 


RIMI 
ROR PROBA 
А РО 


2 0892 
. 0417 
. 0289 
.0235 
. 0206 
. 0187 


۰.17 


. 0165 
. 0158 
90152 
. 0135 
. 0127 
. 0122 
. 0119 
. 0117 
0115 
„0113 
„0111 


„0533 


20193 


„0117 
„0087 
0071 
„0061 
. 5 
0050 
. 0046 
. 0044 
. 0036 
. 0032 
„0030 
. 0029 
„0028 
„0027 
„0027 


„0026 


10.0 


. 0078 
• 0009 
. 0003 
• 0001 
• 0001 
. 0000 
• 0000 
e 0000 
• 0000 
• 0000 
20000 


^. 0000 


. 0000, 
„ 0000 
„0000 
„0000 
„0000 
„0000 


20.0 


_e0008 


„0000 
„0000 
„0000 

„0000“ 


. 0000 | 
«0000 , 


. 0000 


, «0000 


„0000 
„0000 
„0000 
„0000 


¬. 


„0000. 
„0000 
„0000 
„0000 


"4 


O NN O 40M £ QU N w 


о лг (м ህህ > 
O O O Q O Q O mn © ж 


.0 


.4345 


.3748 
„3650 
493297 
„3568 
„3591 
. 3541] 
„3536 
. 3532 
„3528 
. 3528 
„3528 
„3528 
„3528 


.3528 
223528 
3528 


.3333 


„5383 
۰2 7 
2861 
„2828 


„2815 


„2809 


‚ +2806 


„2803 
„2802 


„2800 


„2795 
„2793 
„2792 
„2791 
„2790 
„2789 
„2789 


.2788 


. 0 1 


«2844 
. 2547 


۰5 
.2465 
42459 
02454 
0245) 
22448 
.2446 
„2439 
02435 
024323 
22432 
22531 
©2430 


`.2429 
.2428 


. 2543 
• 2330 
. 2290 
02273 
„2262 
0 2254 
. 2249 
. 2245 
o 2241 
‚2239 
„22130 
„2226 
° 2223 
„2222 
„2220 


.2219 


„2218 


.2217 — 


06 


.2062 
.1951 
„191C 
. 1887 
. 1872 
„1862 
„1855 
21845 
21845 
21841 
„1830 
. 1824 
. 1821 
21818 
21817 
. 1815 
21814 


2.1812 


„0500 


„1885 
. 1777 
e 1730 
. 1704 
. 1688 


„16176 


„1668 
„1661 
„1656 
. 1652 
. 1610 
.1633 
„1630 
„1627 
„ 1625 
„162% 
„1622 


۰1620 7 


. 5000 
. 5000 
. 5000 
. 5000 


„5000 


„5000 . 
.5000 ` 


„5000 


LINE 


F 


22714 
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TABLE ዛ 


S 


AR CI 
ERROR 
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G 
А = 
3.0 
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.04 39 
.0365 


R = 


3.0 


„1019 
00542 
„0321 
20261 


Do 
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MA PO 


> פס טר 
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5 
4.0 


. 0629 
. 0264 
.0120 
. 0084 
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„0170 
• 0071 
e 0047 
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S 
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„0105 
. 0033 
. 0018 
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« 0000 
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• 0000 
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TABLE h 


НЕР 5000. № .3333 .2500 .2000 ۰1000 ۰0508 


КОО 2984 №. 2455 22181 21759 565 
2 О-В. 2607 | .-2208 .2000 1927 21458 
5 58227. 210270 .2121 ۰۱9۱ 52 aus 
በ... 2159 —.2093 .1882 . 1488 7 

50 .3227 0 


N\C .5000  .3333 .2500 .2000 .1000 .0500 


| اا‎ ро 2:5: 253 ЕО 
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.3026 .2209 „1843 „1637 .1259 0 
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„5000 
„5000 
„5000 


1.0 
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„5000 
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2268„ 
1642„ 
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2.0237. 
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. 0015 


S 


„0136 
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TABLE 4 


LINEAR DISCRIMINANT FUNCTION
ERROR PROBABILITIES 
FOR GAMMA POPULATIONS 
R= 
NNC .5000 .3333 2500 .2000 .1000 0509 
l 63555 230 651911. 1682 3%. 13221. 11721 
2 .3073 .2087 .1715 .1517 .1163 .1008 
5 .2809 ۰1979 ۰1612  Á.1409 ,]0hó 0888 
10 „21775 ` 1945 .157Ң ۰.1369 .1003 08 
В = 
МС 5000 1.3333 0.2500 .2000 .1000 .0500 
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2 ۰2857 7.1890 .1526 .1333 .0992 6 
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„2588 


. 
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— «5000 


1,0 
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EUR 
LINEAR DISCRIMINANT FUNCTION 
ERROR PROBABILITIES 
FOR GAMMA POPULATIONS 
R 9 
2.0 3.0 4.0 5.0 
„190% 4.0540 .0166 0057. 
۰1308 ۰ .0247 „0049 .0010 
„0961 0130 ۰0015 2 
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в = 10 
2.0 3.0 4.0 5.0 
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TABLE. 5 
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R = 10 
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LINEAR CISC 
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R 
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„ 0065 
• 0015 
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„0000 
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„5000 


„2633 
„2212 


„2009 


25333 


„1652 
0 1439 
. 1324 


#3333 
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20 
ዘ 


. 1270 
. 1099 
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e 0000 
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e 0000 
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e 0000 
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e 0000: 
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TABLE 4 
CRIMINANT FUN 
PROBABILITIES 
MA POPULAT IONS 
в = 13 
.2500  .2000 
21060 .0891. 
.089% .0731 
R = 14 
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20971 0808 
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90220 
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1.0 2.0 3.0 4.0 
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.5000 .0504% .0023 .0001 
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1.0 2.0 3.0 4.0 
„5000 .1063 .0136 - 0020 
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„00M7 
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.0000 
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. 0003 


_.0000 
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„ 0000 
„ 0000 
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20000. 
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940000 


. 0000. 
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TABLE 4 


LINEAR DISCRIMINANT FUNCTION
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в = 17 í: 
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1 5000 .0981  .0113  .0015  .0002 .0000 0 
2 .5000 .0587 .0037 2.0002  .0000 ..0000 ۹6 
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2 65000 .0536 .0029 ۰1 0000 2.0000 ۰ ۰0 
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2026 
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22000 
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TABLE L 
LINEAR DISCRIMINANT FUNCTION
ERROR PROBABILITIES
FOR GAMMA POPULATIONS
ER- 32 
3333 ..2500 .20060 ۰ ۰0 
„1054  .0753  .0609  .0385 
.0882 .0601  .0h68  .0269 
) | ч 
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.0816° ۰05۷6 .08|9 .0234 
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„0171 


2% 
ያ” 


- 


N\C 
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45000 
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2.0 


.0775, 
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TABLE H 
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ERROR PROBABILIT 
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. 0235 


.0147 


„0500, 


0207 


| .0126 
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SECTION IV 


SUMMARY AND CONCLUSION 


Section II of this paper briefly summarizes some
of the work accomplished by Hodges and Fix in [3].
Their investigation was concerned with the computation
of the probabilities of misclassification for various
nonparametric procedures, assuming some parametric form
for the distributions which describe the populations. The
error probabilities for the "optimum" parametric procedure
were also computed and compared with the nonparametric
error probabilities. The investigation considered the
two-population classification problem when the populations
have normal distributions with equal covariance
matrices. The parametric procedure employed was the
linear discriminant function, which is the appropriate
method in this situation, and the primary nonparametric
procedure considered was the "rule of nearest neighbor."
The two procedures were compared by computing
the probabilities of misclassification. The results of
this investigation indicated that the "rule of nearest
neighbor" gave "reasonable" error probabilities.
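
The comparison summarized above can be illustrated by a minimal
sketch. It is not the computation carried out in [3]; it assumes
univariate normal populations N(0, 1) and N(delta, 1) with equal
prior probabilities, and the function names, the separation delta,
and the Monte Carlo settings are hypothetical choices made only for
illustration. With known parameters, the linear discriminant
function has error probability Phi(-delta/2); the nearest neighbor
error is estimated by simulation.

```python
import math
import random


def ldf_error_normal(delta):
    """Error probability of the linear discriminant function with known
    parameters, equal priors and common unit variance: Phi(-delta / 2)."""
    return 0.5 * math.erfc(delta / (2.0 * math.sqrt(2.0)))


def nn_error_normal(delta, n, trials=4000, seed=0):
    """Monte Carlo estimate of the nearest neighbor error probability.

    Assumed setup (illustrative only): population 1 is N(0, 1) and
    population 2 is N(delta, 1). Each trial draws n training points per
    population and classifies one new point from population 1 by the
    label of its nearest training point.
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        sample1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample2 = [rng.gauss(delta, 1.0) for _ in range(n)]
        z = rng.gauss(0.0, 1.0)
        d1 = min(abs(z - x) for x in sample1)
        d2 = min(abs(z - x) for x in sample2)
        if d2 < d1:  # nearest neighbor belongs to the wrong population
            errors += 1
    return errors / trials


if __name__ == "__main__":
    print("LDF error, delta = 2:", round(ldf_error_normal(2.0), 4))
    print("NN error,  delta = 2, n = 10:", nn_error_normal(2.0, 10))
```

By the symmetry of this assumed setup, the error probability for an
observation from population 2 is the same, so only one of the two
errors is simulated.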

Section III also considers the two-population
classification problem, but the investigation is primarily
concerned with the performance of the linear discriminant
function if the actual densities which describe the
populations are not normal, but in fact gamma with
density functions defined by formulas (5) and (6) of
Section III. Also included in Section III is a
limited investigation of the "rule of nearest neighbor"
when the populations are assumed to be exponential.
Evaluation of the performance of both the linear
discriminant function and the "rule of nearest neighbor"
was accomplished by computation of the probabilities of
misclassification.
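
A simulation gives a rough check on error probabilities of this
kind. The sketch below is an assumption-laden illustration, not the
computational formula used in Section III: it supposes that
population 1 is gamma with shape r and scale 1 and that population 2
is gamma with shape r and scale c, which may differ from formulas (5)
and (6). In one dimension with equal prior probabilities, the sample
linear discriminant function assigns an observation to the
population whose sample mean is nearer, since the pooled variance
affects only the scale of the function and not its sign. The function
name and default settings are hypothetical.

```python
import random


def ldf_errors_gamma(r, c, n, trials=20000, seed=0):
    """Monte Carlo estimate of both error probabilities of the sample
    linear discriminant function for two gamma populations.

    Assumed parameterization (illustrative only): population 1 is
    gamma(shape=r, scale=1), population 2 is gamma(shape=r, scale=c).
    Returns (error for population 1, error for population 2).
    """
    rng = random.Random(seed)
    wrong1 = 0
    wrong2 = 0
    for _ in range(trials):
        # Training samples of size n from each population.
        mean1 = sum(rng.gammavariate(r, 1.0) for _ in range(n)) / n
        mean2 = sum(rng.gammavariate(r, c) for _ in range(n)) / n
        # One new observation from each population.
        z1 = rng.gammavariate(r, 1.0)
        z2 = rng.gammavariate(r, c)
        # Univariate rule: assign to the nearer sample mean.
        if abs(z1 - mean1) > abs(z1 - mean2):
            wrong1 += 1
        if abs(z2 - mean2) > abs(z2 - mean1):
            wrong2 += 1
    return wrong1 / trials, wrong2 / trials


if __name__ == "__main__":
    for c in (0.5, 0.2, 0.05):
        print("r = 1, n = 5, c =", c, ldf_errors_gamma(1, c, 5))
```

Because the two errors are reported separately, the sketch also shows
how unequal they can be when the populations are skewed.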

When the population densities are assumed to be
exponential, Table 3 and Table 4, for the case r = 1,
provide a means of comparing the performance of the
linear discriminant function and the "rule of nearest
neighbor." An examination of these tables indicates that
both procedures can result in "high" probabilities of
error, particularly when c assumes values near one;
for small sample sizes, both procedures can result
in error probabilities which are greater than 1/2.
Even as n, the sample size from each population,
tends to infinity, the linear discriminant function
has error probabilities greater than 1/2 for a range
of values of c below 1; it is of interest to note that
the "rule of nearest neighbor" in this situation will
always have error probabilities less than or equal to
1/2. Also, depending upon the importance of each type of
error, it is possible for the linear discriminant
function to be a "fairly useful" procedure since one error
probability is usually "small." Table 4 also shows
that as r increases, the probabilities of
misclassification decrease. This result was anticipated since,
for increasing r, the gamma distribution approaches the
normal distribution by the Central Limit Theorem.
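
The large-sample behavior described above can be reproduced in
closed form under a simple assumed model. The sketch below is not
taken from Section III: it supposes that population 1 is exponential
with mean 1, that population 2 is exponential with mean c, 0 < c < 1,
and that prior probabilities are equal. As n tends to infinity the
sample means converge to 1 and c, so the univariate linear
discriminant function reduces to a cutoff at the midpoint
t = (1 + c)/2. Under these assumptions the error for population 1 is
1 - exp(-t) and the error for population 2 is exp(-t/c). The function
name and the constant C_STAR are hypothetical.

```python
import math


def ldf_limit_errors_exponential(c):
    """Limiting (n -> infinity) error probabilities of the univariate
    linear discriminant function for exponential populations.

    Assumed model (illustrative only): population 1 has mean 1,
    population 2 has mean c with 0 < c < 1, and priors are equal.
    Returns (error for population 1, error for population 2).
    """
    if not 0.0 < c < 1.0:
        raise ValueError("this sketch covers 0 < c < 1 only")
    t = (1.0 + c) / 2.0            # midpoint between the two means
    error1 = 1.0 - math.exp(-t)    # population 1 observation falls below t
    error2 = math.exp(-t / c)      # population 2 observation falls above t
    return error1, error2


# Under the assumed model, error1 exceeds 1/2 exactly when c > C_STAR.
C_STAR = 2.0 * math.log(2.0) - 1.0

if __name__ == "__main__":
    for c in (0.9, 0.5, 0.2, 0.05):
        print(c, ldf_limit_errors_exponential(c))
```

In this assumed model one error probability exceeds 1/2 for c near
one while the other remains well below 1/2, which agrees
qualitatively with the remark that one error probability is usually
"small." The cutoff value C_STAR belongs to the sketch and is not a
figure taken from the tables.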
The following recommendations are made on the basis
of this paper.

(i) Investigate the performance of the nonparametric
    procedure using k = 3 instead of the "rule of
    nearest neighbor," k = 1.

(ii) Investigate the performance of the nonparametric
    procedures proposed by Hodges and Fix
    in [2] employing different distance functions.

(iii) Develop a more satisfactory computational
    formula for the linear discriminant function
    when the populations are assumed to be gamma
    in the situation when r and n are large, since
    the formula used in this paper required many
    hours of computer time.

(iv) Investigate the performance of the linear
    discriminant function and other nonparametric
    procedures for other distributions. A cursory
    investigation was made for the beta distribution
    and the analysis appears to be more difficult.

(v) Compare the performance of Bayesian parametric
    and nonparametric classification procedures.

(vi) Investigate the classification problem when
    there are more than two populations.
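
As a starting point for recommendations (i) and (ii), the following
is a minimal sketch of a k-nearest-neighbor rule with a replaceable
distance function. It is an illustration only and is not the
procedure of [2] in full: the function names, the univariate
exponential populations (mean 1 and mean c), and the simulation
settings are assumptions introduced here. An observation is assigned
to the population holding the majority among its k nearest training
points; k should be odd so that the vote cannot tie.

```python
import random
from collections import Counter


def knn_classify(z, sample1, sample2, k=1, dist=lambda a, b: abs(a - b)):
    """Assign z to population 1 or 2 by majority vote among its k
    nearest training points; k = 1 is the rule of nearest neighbor."""
    labeled = [(dist(z, x), 1) for x in sample1]
    labeled += [(dist(z, x), 2) for x in sample2]
    labeled.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in labeled[:k])
    return 1 if votes[1] > votes[2] else 2


def knn_error_exponential(c, n, k, trials=5000, seed=0):
    """Monte Carlo estimate of the average of the two error
    probabilities, assuming (for illustration) exponential populations
    with means 1 and c and n training points from each."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        sample1 = [rng.expovariate(1.0) for _ in range(n)]
        sample2 = [rng.expovariate(1.0 / c) for _ in range(n)]
        if knn_classify(rng.expovariate(1.0), sample1, sample2, k) != 1:
            wrong += 1
        if knn_classify(rng.expovariate(1.0 / c), sample1, sample2, k) != 2:
            wrong += 1
    return wrong / (2 * trials)


if __name__ == "__main__":
    for k in (1, 3):
        print("k =", k, knn_error_exponential(0.2, 10, k))
```

A different distance function can be tried by passing another
two-argument callable as dist, for example
dist=lambda a, b: abs(a - b) / (1.0 + a + b).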







BIBLIOGRAPHY 


Tomas E. Jro, Di Secrimanateon Procedures,‏ בר 
Small Sample Performance," Thesis at United States‏ 
Naval Postgraduate School, 19603.‏ 


[2] Fix, E. and Hodges, J. L., Jr., "Nonparametric
Discrimination. I. Consistency Properties,"
Technical Report prepared under contract between
University of California and School of Aviation
Medicine, Randolph Field, Texas, 22 pp.


[3] Fix, E. and Hodges, J. L., Jr., "Nonparametric
Discrimination: Small Sample Performance,"
Technical Report prepared under contract between
University of California and School of Aviation
Medicine, Randolph Field, Texas.


[4] Hodges, J. L., Jr., "Discriminatory Analysis. I.
Survey of Discriminatory Analysis," Technical
Report prepared under contract between University
of California and School of Aviation Medicine,
Randolph Field, Texas.

